Fix FA tutorial #485

Merged
zhanglx13 merged 1 commit into triton-mlir on Jan 25, 2024
Conversation

@zhanglx13 commented:

  • Check correctness for fp8 inputs only when torch supports it (see the sketch after this list)
  • Only run the benchmark in fp16
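Both changes are small guards in tutorials/06-fused-attention.py. A minimal sketch of the first one, assuming the tutorial's pytest parametrization; the names here are illustrative, and the exact fp8 attribute probed may differ (a ROCm build might check torch.float8_e5m2fnuz instead, for example):

```python
import pytest
import torch

# fp8 tensor dtypes only exist in newer torch builds; torch >= 2.1
# exposes torch.float8_e5m2, while older builds have no fp8 dtype at all.
TORCH_HAS_FP8 = hasattr(torch, "float8_e5m2")

@pytest.mark.parametrize("dtype", ["fp16", "fp8"])
def test_op_fwd(dtype):
    # Only check fp8 correctness when torch can represent fp8 tensors.
    if dtype == "fp8" and not TORCH_HAS_FP8:
        pytest.skip("fp8 requires a torch build with float8 support")
    ...  # build q/k/v in the requested dtype, run the kernel, compare to a reference
```

This is why the fp8 cases show up below as SKIPPED rather than FAILED on builds without fp8 support.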
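The second change, sketched the same way on top of triton.testing.Benchmark (the sweep values and plot name are placeholders, not necessarily the tutorial's actual configuration):

```python
import torch
import triton

# The benchmark sweep keeps only an fp16 variant; with no fp8 entry
# left in the list, the perf report never touches fp8 tensors.
configs = [
    triton.testing.Benchmark(
        x_names=["N_CTX"],
        x_vals=[2**i for i in range(10, 14)],
        line_arg="provider",
        line_vals=["triton"],
        line_names=["Triton (fp16)"],
        ylabel="TFLOPS",
        plot_name="fused-attention-fwd-fp16",
        args={"dtype": torch.float16},
    )
]
```

In the tutorial this list would feed a @triton.testing.perf_report(configs) decorator; the point of the change is simply that no fp8 entry remains in the sweep.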
@xiaohuguo2023 (Member) commented:

Passed my test on an MI250:
collected 28 items
Running 28 items in this shard: tutorials/06-fused-attention.py::test_op_fwd[False-4-48-1024-64-fp16], tutorials/06-fused-attention.py::test_op_fwd[False-4-48-1024-64-fp8], tutorials/06-fused-attention.py::test_op_fwd[False-4-48-2048-64-fp16], tutorials/06-fused-attention.py::test_op_fwd[False-4-48-2048-64-fp8], tutorials/06-fused-attention.py::test_op_fwd[False-4-48-4096-64-fp16], tutorials/06-fused-attention.py::test_op_fwd[False-4-48-4096-64-fp8], tutorials/06-fused-attention.py::test_op_fwd[False-4-48-1024-128-fp16], tutorials/06-fused-attention.py::test_op_fwd[False-4-48-1024-128-fp8], tutorials/06-fused-attention.py::test_op_fwd[False-4-48-2048-128-fp16], tutorials/06-fused-attention.py::test_op_fwd[False-4-48-2048-128-fp8], tutorials/06-fused-attention.py::test_op_fwd[False-4-48-4096-128-fp16], tutorials/06-fused-attention.py::test_op_fwd[False-4-48-4096-128-fp8], tutorials/06-fused-attention.py::test_op_fwd[True-4-48-1024-64-fp16], tutorials/06-fused-attention.py::test_op_fwd[True-4-48-1024-64-fp8], tutorials/06-fused-attention.py::test_op_fwd[True-4-48-2048-64-fp16], tutorials/06-fused-attention.py::test_op_fwd[True-4-48-2048-64-fp8], tutorials/06-fused-attention.py::test_op_fwd[True-4-48-4096-64-fp16], tutorials/06-fused-attention.py::test_op_fwd[True-4-48-4096-64-fp8], tutorials/06-fused-attention.py::test_op_fwd[True-4-48-1024-128-fp16], tutorials/06-fused-attention.py::test_op_fwd[True-4-48-1024-128-fp8], tutorials/06-fused-attention.py::test_op_fwd[True-4-48-2048-128-fp16], tutorials/06-fused-attention.py::test_op_fwd[True-4-48-2048-128-fp8], tutorials/06-fused-attention.py::test_op_fwd[True-4-48-4096-128-fp16], tutorials/06-fused-attention.py::test_op_fwd[True-4-48-4096-128-fp8], tutorials/06-fused-attention.py::test_op_bwd[4-48-1024-64], tutorials/06-fused-attention.py::test_op_bwd[4-48-2048-64], tutorials/06-fused-attention.py::test_op_bwd[4-48-4096-64], tutorials/06-fused-attention.py::test_op_bwd[1-16-8192-64]

06-fused-attention.py::test_op_fwd[False-4-48-1024-64-fp16] PASSED
06-fused-attention.py::test_op_fwd[False-4-48-1024-64-fp8] SKIPPED
06-fused-attention.py::test_op_fwd[False-4-48-2048-64-fp16] PASSED
06-fused-attention.py::test_op_fwd[False-4-48-2048-64-fp8] SKIPPED
06-fused-attention.py::test_op_fwd[False-4-48-4096-64-fp16] PASSED
06-fused-attention.py::test_op_fwd[False-4-48-4096-64-fp8] SKIPPED
06-fused-attention.py::test_op_fwd[False-4-48-1024-128-fp16] PASSED
06-fused-attention.py::test_op_fwd[False-4-48-1024-128-fp8] SKIPPED
06-fused-attention.py::test_op_fwd[False-4-48-2048-128-fp16] PASSED
06-fused-attention.py::test_op_fwd[False-4-48-2048-128-fp8] SKIPPED
06-fused-attention.py::test_op_fwd[False-4-48-4096-128-fp16] PASSED
06-fused-attention.py::test_op_fwd[False-4-48-4096-128-fp8] SKIPPED
06-fused-attention.py::test_op_fwd[True-4-48-1024-64-fp16] PASSED
06-fused-attention.py::test_op_fwd[True-4-48-1024-64-fp8] SKIPPED
06-fused-attention.py::test_op_fwd[True-4-48-2048-64-fp16] PASSED
06-fused-attention.py::test_op_fwd[True-4-48-2048-64-fp8] SKIPPED
06-fused-attention.py::test_op_fwd[True-4-48-4096-64-fp16] PASSED
06-fused-attention.py::test_op_fwd[True-4-48-4096-64-fp8] SKIPPED
06-fused-attention.py::test_op_fwd[True-4-48-1024-128-fp16] PASSED
06-fused-attention.py::test_op_fwd[True-4-48-1024-128-fp8] SKIPPED
06-fused-attention.py::test_op_fwd[True-4-48-2048-128-fp16] PASSED
06-fused-attention.py::test_op_fwd[True-4-48-2048-128-fp8] SKIPPED
06-fused-attention.py::test_op_fwd[True-4-48-4096-128-fp16] PASSED
06-fused-attention.py::test_op_fwd[True-4-48-4096-128-fp8] SKIPPED
06-fused-attention.py::test_op_bwd[4-48-1024-64] PASSED
06-fused-attention.py::test_op_bwd[4-48-2048-64] PASSED
06-fused-attention.py::test_op_bwd[4-48-4096-64] PASSED
06-fused-attention.py::test_op_bwd[1-16-8192-64] PASSED

================= 16 passed, 12 skipped in 340.48s (0:05:40) =================

@xiaohuguo2023 (Member) left a comment:
good to go

zhanglx13 merged commit c631824 into triton-mlir on Jan 25, 2024
2 checks passed